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Abstract 


A  new  technique  for  the  estimation  of  autoregressive  filter  parameters  of  a  non- 
Gaussian  autoregressive  process  is  proposed.  The  probability  density  function  of 
the  driving  noise  is  assumed  to  be  known.  The  new  technique  is  a  two-stage  pro¬ 
cedure  motivated  by  maximum  likelihood  estimation.  It  is  computationally  much 
simpler  than  the  maximum  likelihood  estimator  and  does  not  suffer  from  conver¬ 
gence  problems.  Computer  simulations  indicate  that  unlike  the  least  squares  or 
linear  prediction  estimators,  the  proposed  estimator  is  nearly  efficient,  even  for 
moderately  sized  data  records.  By  a  slight  modification  the  proposed  estimator 
can  also  be  used  in  the  case  when  the  parameters  of  the  driving  noise  probability 
density  function  are  not  known. 


I.  Introduction 


Estimation  of  the  parameters  of  autoregressive  (AR)  processes  has  been  widely 
addressed  [Box  and  Jenkins  1970],  [Kay  1986].  These  processes  are  modeled  by  an 
all-pole  filter  excited  by  a  white  Gaussian  process,  also  referred  to  as  the  driving  noise. 
The  class  of  AR  processes  driven  by  white  non-Gaussian  noise  has  not  received  much 
attention,  although  they  are  capable  of  representing  a  wide  range  of  physical  processes 
[Sengupta  and  Kay  1986].  Previous  research  has  shown  that  it  may  be  possible  to  esti¬ 
mate  some  of  the  parameters  characterizing  a  non-Gaussian  AR  process  more  precisely 
than  those  of  a  Gaussian  AR  process  with  the  same  power  spectral  density  (PSD). 
Specifically,  the  Cramer-Rao  bound  (CR  bound)  for  the  variances  of  the  estimators  for 
the  AR  filter  parameters  [Martin  1982]  is  lower  in  the  non-Gaussian  case  than  in  the 
Gaussian  case  [Pakula  1986],  [Sengupta  1986].  Yet  utilization  of  this  theoretical  result 
has  been  limited.  The  method  of  maximum  likelihood,  which  is  the  most  widely  consid¬ 
ered  approach  for  AR  parameter  estimation,  is  usually  associated  with  computational 
complexity  and  convergence  problems.  In  the  case  of  non-Gaussian  PDF’s  maximiza- 
tion  of  the  likelihood  function  leads  to  a  set  of  highly  non-linear  equations  [Sengupta 
and  Kay  1986]  in  contrast  to  the  Gaussian  case  where  the  equations  become  linear 
after  a  few  simplifying  assumptions.  It  is  the  solution  of  these  non-linear  equations 
which  results  in  the  computational  complexity  of  the  maximum  likelihood  estimator 
(MLE).  Iterative  techniques  which  are  often  used  to  solve  these  equations  suffer  from 
convergence  problems  for  short  data  records. 

Several  non-Gaussian  PDF’s  have  been  proposed  to  model  the  driving  noise.  Zero- 
mean  symmetric  PDF’s  having  tails  heavier  than  the  Gaussian  tail  are  of  particular 
interest  because  they  may  model  a  nominally  Gaussian  background  with  occasional  im¬ 
pulses.  Such  noise  processes  sire  encountered  in  low-frequency  atmospheric  communica¬ 
tions  [Bernstein  1974],  sonar  and  radar  problems  and  so  on.  Examples  of  heavy-tailed 
PDF’s  are  the  mixed- Gaussian  PDF,  the  Johnson  family  and  Middleton’s  class  A  and 
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B  PDF  families.  All  of  them  lead  to  a  complicated  likelihood  function  which  is  difficult 
to  maximize. 

This  paper  suggests  a  method  to  estimate  the  AR  filter  parameters  of  a  non- 
Gaussian  process  in  a  computationally  simple  way.  Essentially  it  is  a  two-stage  proce¬ 
dure  based  on  an  approximation  of  the  MLE.  The  resulting  estimator  is  asymptotically 
efficient  in  the  sense  that  its  variance  approaches  the  CR  bound  for  large  data  records. 
The  mixed-Gaussian  PDF  is  used  to  illustrate  the  approach  and  to  demonstrate  some 
of  the  finer  aspects. 

The  paper  is  organized  as  follows.  Section  II  gives  an  interpretation  of  the  MLE 
which  forms  the  basis  for  the  development  of  the  new  estimator.  Several  special  cases 
axe  discussed  to  illustrate  the  central  argument.  Section  IH  suggests  an  approximation 
to  the  MLE  based  on  this  interpretation.  Section  IV  actually  implements  such  an 
estimator  for  the  case  of  a  mixed-Gaussian  distribution.  Section  V  discusses  the  case 
when  some  of  the  parameters  of  the  noise  PDF  are  unknown  while  section  VI  reports 
the  results  of  computer  simulations.  Section  VII  summarizes  the  main  results. 

n.  An  Interpretation  of  the  MLE 

Consider  N  observations  of  an  AR(p)  process 


p 


(1) 


where  the  driving  noise  un  has  the  PDF  /(un;  0)  dependent  on  the  parameter  vector 
0.  /  is  assumed  to  be  an  even  function  and  hence  zero  mean.  It  is  also  assumed  that  f 
has  tails  heavier  than  a  Gaussian  PDF  g  having  equal  variance,  t.e.,  there  is  a  number 
U  such  that 


/(un)  >  g(un)  for  |un|  >  U 


(2) 


subject  to  the  constraint 
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The  log  likelihood  function  for  the  AR  filter  parameters  is  given  by  the  joint  PDF 
of  {xi ,  x2,  •  •  •  >  zjv}  when  the  xt-’s  are  replaced  by  their  observed  values.  This  is  difficult 
to  evaluate.  It  is  a  common  practice  to  replace  the  exact  likelihood  function  by  the 
conditional  liklihood  of  {xp+i,Xp+2>  •  •  •  given  {x*,  x2,  •  •  • ,  xp}  for  the  purpose  of 
maximization  over  the  parameters.  It  can  be  easily  shown  [Box  and  Jenkins  1970)  that 
the  conditional  log  likelihood  function  is  given  by 


N 


Inf  =  ^2  In /(“«;©) 

n=p+ 1 

where  clq  =  1.  Differentiating  with  respect  to  ay 
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(4) 

(5) 


The  MLE  of  a  =  [ai  a2  •  •  •  ap]  is  found  by  solving 


N 


^  ^  ®n— yttf»r(Wn) 
r»=p+l 


«n=ELo 


=  0,  J  =  1, 2,  •  •  •  p 


(6) 


provided  0  is  either  known  or  replaced  by  its  MLE  0  in  order  to  calculate  T(un)  from 
(5).  From  this  point  onwards  0  will  be  assumed  to  be  known.  It  will  be  shown  in 
section  V  that  for  some  PDF’s  the  method  to  be  described  can  be  implemented  with 
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0  replaced  by  a  reasonable  estimate.  Note  that  since  /  is  assumed  to  be  a  symmetric 
PDF,  /'  is  an  odd  function  of  un.  Therefore  f'/un  is  even  in  un  and  T(tin)  given  by 

(5)  is  an  even  function.  r(un)  is  assumed  to  exist  over  the  domain  of  /(un;  0).  (6)  can 
also  be  written  as 


N 


^  ^  ^  t^n— yr(un) 

i=0  n—p+ 1 


U»=£,P=0  a>Xn-i 


=  0,  y  =  i,2,.--P 


(7) 


For  the  special  case  of  a  Gaussian  PDF  with  variance  a2,  /' / /  =  -un/cr2.  There¬ 
fore  T(un)  =  l/<72  and  (7)  reduces  to 

P  AT 


xn-»Xn_y-0,  J  —  1,2,  •••p 

t=0  n=p+l 


(8) 


which  can  be  recognized  as  the  covariance  method  of  linear  prediction,  known  to  be 
approximately  the  MLE  in  the  Gaussian  case.  It  is  the  solution  of  a  linear  least  squares 
(LS)  problem  [Box  and  Jenkins  1970],  [Kay  1986] 

N  /  p  N  2 


"  /  P  y 

n=p+l  \  y=l  / 


(9) 


(7)  resembles  the  solution  of  a  LS  problem  except  for  the  weighting  factors  T(un).  Note 
that  the  inherent  dependence  of  un  (and  hence  r(un))  on  a,  the  AR  filter  parameters, 
makes  it  a  non-linear  problem.  If,  however,  the  argument  of  T  is  calculated  using  a  fixed 
(and  hopefully  an  approximate)  value  of  a,  (7)  becomes  the  solution  of  the  following 
weighted  LS  problem 


N  /  P  \2 

nun  ^  ^  (  xn  +  }  >yxn_y  j  T(un) 
n=p+l \  y=l  / 


(10) 


tin,  a  quantity  expected  to  be  close  to  un  is  defined  by  (l) 


«n  =  y^dyJn-y,  n  =  p-f-l,p  +  2,---JV 

j=o 


(11) 
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where  some  fixed  approximate  values  of  the  AR  filter  parameters  are  used,  do  is  defined 
to  be  unity.  (10)  is  minimized  by  (7)  with  T(un)  replaced  by  T(un)  which  reduces 
to  a  set  of  linear  equations  whose  unique  solution  can  be  expected  to  be  close  to 
the  MLE  of  a.  The  resulting  estimator  should  be  much  better  than  the  unweighted 
LS  estimator  (resulting  from  (9))  because  it  retains  the  general  shape  of  T(un)  by 
approximating  it  with  r(un).  The  sequence  {un  I  p  +  1  <  n  <  N }  can  be  generated 
by  passing  {xp+i,xp+2,-  •  •  ,iv}  through  a  moving  average  (MA)  filter  as  per  (11) 
with  coefficients  obtained  from  a  preliminary  stage  of  least  squares  estimation  ( e.g .,  by 
covariance  method).  In  other  words,  un  becomes  an  estimate  of  the  nth  sample  of  the 
driving  noise  based  on  an  LS  estimate  of  the  filter  parameters  such  as  the  covariance 
method.  This  approach  leads  to  an  approximate  MLE  which  is  described  in  the  next 
section. 

It  is  of  interest  to  know  how  the  terms  of  (10)  are  actually  weighted.  Three  sym¬ 
metric  PDF’s  which  are  commonly  used  to  model  heavy-tailed  non-Gaussian  processes 
[Czamecki  and  Thomas  1982],  [Middleton  1977],  [Johnson  and  Kotz  1971]  are  now  con¬ 
sidered.  The  plots  of  the  weighting  function  T(u)  for  these  PDF’s  provide  insight  into 
the  structure  of  the  MLE. 

Mixed- Gaussian  Model:  The  mixed-Gaussian  PDF  has  received  considerable  atten¬ 
tion  in  situations  where  the  underlying  random  process  is  characterized  by  the  presence 
of  occasional  impulses  in  an  otherwise  Gaussian  process.  The  PDF  is  given  by 

/(u)  =  (1  -  c)£b(u)  +  e£j(u),  0  <  e  <  1  (12) 

where  Eb  and  Ei  are  Gaussian  PDF’s  with  parameters  [mb, 0#]  and  respec¬ 

tively.  Assuming  «  a],  the  fraction  e  can  be  thought  of  as  the  degree  of  con¬ 
tamination  of  the  low-variance  Gaussian  process  with  PDF  Eb  by  the  high-variance 
component  with  PDF  Ei.  Only  the  zero-mean  case  (/lib  =  0 ,m  =0)  will  be  considered 
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here.  (12)  is  explicitly  written  as 


(13) 


For  this  PDF  T(u)  can  be  shown  to  be 


(14) 


where  p  =  o}/a\  and  u  =  u/aB .  Figure  1(a)  plots  T(u)/r(0)  vs.  u  (=  u  normalized 


by  <7B)  for  p  =  100.  Curves  for  different  values  of  e  are  overlayed.  Figure  1(b)  plots 


r(u)/r(0)  vs.  u  for  e  —  0.1  and  different  values  of  p.  The  curves  show  that  T(u)  acts 
as  a  limiter.  The  squared  errors  in  (10)  with  large  values  of  un  are  suppressed.  This 
makes  intuitive  sense  because  laxge  values  of  un  ( spikes  at  the  input  of  the  AR  filter) 
would  otherwise  dominate  the  sum  of  the  squares  and  consequently  the  information 
contained  in  the  rest  of  the  terms  will  be  lost.  Quantitatively,  from  (14)  with  p»l, 
T(0)  «  1/og  and  T(u)  — ►  1  /o]  as  u  — ►  oo,  t'.e.,  very  small  and  very  large  terms  in  (10) 
are  scaled  in  the  inverse  ratio  of  background  and  interference  noise  powers,  respectively. 
This  is  in  accordance  with  analogous  results  in  optimal  weighted  least  squares  theory 
[Sorenson  1980]. 

Middleton’s  Class  A  Model:  Another  physically  motivated  model  to  represent 
nominally  Gaussian  noise  with  an  impulsive  component  is  Middleton’s  class  A  PDF 
given  by  the  infinite  sum 


where  0  <  A  <  1  and  Em(u)  is  a  zero-mean  Gaussian  PDF  with  variance  given  by 


(16) 
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with.  B  a  constant.  The  Gaussian  component  corresponding  to  m  =  0  has  the  least 
variance  and  can  be  thought  of  as  the  background  process.  The  weights  of  the  higher 


order  terms  can  be  controlled  by  the  constant  A.  A  small  value  of  A  will  diminish  the 


contribution  of  the  contaminating  high-order  terms.  The  constant  B  can  be  used  to 
adjust  the  variance  of  thesecomponents.  The  overall  variance  of  the  Middleton’s  class 
A  PDF  is  o2.  T(u)  can  be  written  in  this  case  as 


(17) 


Figures  2(a)  and  2(b)  plot  T(u)/r(0)  vs.  u  for  this  PDF  for  different  values  of  A  and 


B.  o2  is  assumed  to  be  unity.  These  curves  also  are  observed  to  be  similar  to  a  limiter 


curve.  A  larger  value  of  A  implies  an  increased  presence  of  high-variance  components. 
Therefore  a  smaller  threshold  is  necessary  above  which  the  squared  terms  of  (10)  need 


to  be  down-weighted  in  order  to  preserve  the  information  in  the  remaining  terms.  This 


is  reflected  in  Figure  2(a)  which  clearly  exhibits  a  smaller  threshold  for  a  higher  A. 
Also,  a  larger  value  of  B  implies  less  difference  in  the  variances  of  the  Gaussian  terms 
corresponding  to  m  =  0  and  m  >  0,  i.e.,  a  smaller  deviation  from  Gaussianity.  A 
smaller  value  of  B  indicates  more  non-Gaussianity  and  hence  a  smaller  threshold  is 
necessary,  which  is  confirmed  by  Figure  2(b). 

Johnson  Family:  The  Johnson  family  of  PDF’s  is  one  of  the  heavy-tailed  families 
obtained  by  applying  a  transformation  to  a  Gaussian  random  variable.  If  v  is  a  Gaus¬ 
sian  random  variable  with  mean  zero  and  variance  one,  then  the  transformed  random 
variable 


has  a  PDF  belonging  to  the  Johnson  family  given  by 


(18) 
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t  is  chosen  to  be 


-.  1 
2 


t  = 


2  a2 

e(&)  -  1 


so  that  the  density  has  a  variance  a2.  The  parameter  t  can  be  used  to  control  the 
heaviness  of  tail.  A  smaller  value  of  t  implies  a  heavier  tail.  The  PDF  approaches  a 
Gaussian  one  as  6  — >  oo.  T(u)  can  be  written  in  this  case  as 


1  +  ^ 


u2l~*<52 

— -  smh  1 
ut 


(19) 


Figure  3  plots  T(u)/r(u)  vs.  u/a  for  different  values  of  t.  A  larger  value  of  t  indicates 
less  deviation  from  Gaussianity  confirmed  by  a  gradual  decrease  of  the  curve  from  the 
value  at  u  =  0.  Smellier  values  of  t  correspond  to  a  sharper  transition  and  a  smaller 
threshold. 

All  these  illustrations  show  that  the  MLE  given  as  a  solution  of  (7)  actually 
downweights  the  larger  squared  terms  and  may  therefore  be  well  approximated  by 
an  appropriate  weighted  LS  estimator.  The  following  section  elaborates  on  this  point. 
It  should  also  be  noted  that  T(u)  is  positive  in  all  the  above  cases.  This  is  true  for  any 
symmetric  PDF  which  is  a  monotonically  decreasing  function  of  u  for  positive  values 
of  u,  as  may  be  verified  from  (5).  Most  of  the  common  PDF’s  have  this  property. 


IQ.  An  approximation  to  the  MLE 

It  was  suggested  in  the  previous  section  that  the  problem  of  solving  the  set  of 
highly  non-linear  equations  (7)  can  be  replaced  by  solving  a  set  of  linear  equations  if 
the  weighting  function  T(un)  is  known  or  can  be  estimated  for  each  n.  Specifically,  this 
suggests  a  two-stage  procedure.  The  first  stage  involves  computation  of  the  unweighted 
LS  estimates  of  the  unknown  filter  parameters.  These  crude  estimates  can  be  used  to 
estimate  un  as  per  (11)  and  hence  r(un)  required  for  the  second  stage  of  weighted  LS 
estimation.  This  procedure  would  eliminate  convergence  problems  and  much  of  the 
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computational  complexity  associated  with  the  MLE.  Yet  it  is  apparent  from  (14),  (17) 
and  (19)  that  even  if  un  were  known  computation  of  the  weighting  function  might  be 
difficult  for  many  heavy-tailed  non-Gaussian  PDF’s.  Typically  the  weighting  function 
involves  computation  of  transcendental  functions  such  as  exponentials  which  may  be 
computationally  burdensome.  The  problem  would  be  simplified  considerably  if  T  could 
be  approximated  by  a  simple  function  whose  characteristics  depended  on  the  PDF 
parameters.  A  possible  approximation  of  the  weighting  curves  shown  in  Figures  1-3  is 
the  Butterworth  “filter” 


Ki 


1  + 

u 

7 

«c 

+  K2 


(20) 


where  uc  denotes  the  l3  dB  cutoff’,  /3  is  the  order  of  approximation  (not  necessarily 
an  integer)  and  Kx  >  0  and  K2  >  0  can  be  used  to  match  f  with  T  for  u  =  0  and 
u  f  •  All  these  parameters  can  be  used  to  produce  an  accurate  approximation  of 
the  usually  complicated  function  T.  The  second  stage  of  LS  can  therefore  be  simplified 
by  minimizing  (10)  with  T  replaced  by  f .  This  yields 


••P 


which  in  matrix  form  is 
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(  N 

l^n— if  (un) 

n-p+1 
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y  ^  Xn_2Xyi_if(un) 
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y  ^  Xn-pXn^lf(Un) 

J ^  n=p+l 
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y  ]  ^n— l^n— 2?  (^n) 

n=p+l 

TV 

E  *cn~2:cri“2-^(^^) 

n=p+l 

N 

E  xn-Pxri— 2^  (wn) 

n=p-hl 

x 


y  ^  l^n— pf  (^n) 

n=p-f  1 

N 

/  al\ 

y;  2^n— pf  (^n) 

a2 

n=p-(-l 

; 

N 

E  xn— p^n-pf (un) 

n=p+l  J 

\  ap  ' 

"  .  \ 
y  v  ^n^n— ir(un) 

n=p+l 

AT 

y:  ^n^n— 2f(^n) 

n=p+l 


JV 


y  ]  ^n^n— pf(ttn) 

\n=p+l 

(21) 


X  is  a  symmetric  p  X  p  matrix  which  is  positive  semidefinite.  To  show  this  assume 
b  =  [b\  62  •  *  *  6P]  is  a  vector  of  read  numbers.  Then, 

bTXb=EE^  E  xn-»Xn-jf(un) 

»=1  y=l  n=p+l 

=  ^  ^(^n)  YTMitn-jX*-,- 

n=p+l  t=l y=i 

TV  p  t2 

=  E  *(*»)  E6*1— 

n=p+l  L»=l 

>0 


since  the  weights  f(un)  axe  always  positive  (see  (20)).  (21)  can  be  solved  in  0(p3) 
operations.  A  Cholesky  decomposition  can  be  used  to  reduce  computation.  This  is 
in  contrast  to  the  unweighted  least  squares,  for  which  it  is  possible  to  estimate  the 
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parameters  in  0(p2)  operations  [Morf  et  cd  1977].  In  summary,  the  proposed  technique 


IS 


STEP  I :  Use  covariance  method  ((21)  with  f  (un)  =  1)  to  obtain  initial  estimates 


of  a. 


STEP  II .  Generate  the  sequence  un  by  passing  {xp+i,xp+2)  —  ,  x^}  through  the 
MA  filter  whose  coefficients  are  as  estimated  in  step  I,  as  given  by  (11). 


STEP  III :  Select  the  curve  f  (20)  by  choosing  appropriate  values  of  uc,  /?,  K\  and 


K-i  from  the  known  values  of  the  PDF  parameters  (0). 

STEP  TV :  Solve  for  a  from  (21). 

Step  m  will  be  different  for  different  non-Gaussian  PDF’s.  The  following  section 
addresses  this  part  of  the  problem  for  the  specific  case  of  a  mixed- Gaussian  distribution. 

IV.  Weighted  LS  for  Mixed-Gaussian  PDF 

Performance  of  the  weighted  LS  estimator  described  in  the  previous  section  is 
expected  to  be  dependent  on  how  well  the  curve  f  can  approximate  the  true  weighting 
curve  T.  Therefore  the  parameters  of  f ,  namely,  uc,  (3,  Kx  and  K2  should  be  chosen 
properly  for  every  set  of  values  of  the  parameters  of  the  PDF,  0.  In  the  mixed- 
Gaussian  case  the  parameters  e  and  p  determine  the  shape  of  T  (see  Figures  1(a)  and 
1(b)).  Ki  and  K2  can  be  found  as  a  function  of  these  two  parameters  by  matching  the 
values  of  T  and  f  for  u  =  0  and  u  — >  oo.  It  follows  from  (14)  assuming  =  l  that 


1 

P 


Also,  f  (0)  =  Ki  +  K2  and  f  (oo)  — ►  K2.  Therefore 


P 


12 


(22) 


Fortunately,  K\  and  K2  are  obtained  as  explicit  closed-form  functions  of  e  and  p.  0% 
has  been  assumed  to  be  unity  without  loss  of  generality,  since  it  will  only  change  uc  by 
a  scale  factor  and  furthermore  the  weighting  curve  need  only  be  determined  to  within 
a  scale  factor  for  use  in  (21).  uc  can  be  chosen  to  match  f  and  T  at  u  =  uc.  It  is  found 
by  solving 

r(uc)  = -^  +  #2  (23) 

where  T  and  is  defined  by  (14).  T  being  a  complicated  function,  (23)  can  only  be 
solved  by  a  search  algorithm.  Figure  4  plots  uc,  as  obtained  by  solving  (23),  vs.  e  for 
different  values  of  p.  The  curves  can  be  explained  by  interpreting  the  mixed-Gaussian 
PDF  as  one  arising  from  a  nominally  Gaussian  noise  contaminated  by  a  high  variance 
Gaussian  process.  A  small  value  of  e  implies  little  contamination  by  the  high  variance 
population  and  therefore  the  limiter  can  allow  for  reasonably  large  values  of  un  and 
hence  a  large  uc  results.  On  the  other  hand  as  e  increases,  increased  interference  from 
the  high  variance  population  is  compensated  for  by  making  the  threshold  uc  smaller 
so  as  not  to  allow  large  values  of  un  suppress  the  information  contained  in  the  other 
terms.  Figure  5  plots  uc  vs.  p  for  different  values  of  e.  It  shows  that  the  threshold 
is  minimum  for  p  «  10.  If  p  is  large,  the  contaminating  population  would  introduce 
very  large  spikes  and  therefore  a  high  threshold  would  suffice  to  suppress  them.  For 
a  smaller  ratio  of  Cj  to  Gg  a  smaller  threshold  is  necessary.  When  Oj  and  Og  are  of 
the  same  order  (p  <  10),  it  becomes  difficult  to  distinguish  between  contributions  from 
the  two  populations  and  therefore  most  of  the  terms  should  be  equally  weighted,  which 
is  accomplished  by  causing  the  threshold  to  be  large,  as  can  be  verified  from  Figure 
5.  An  interesting  special  case  is  p  =  1  when  the  mixed-Gaussian  PDF  degenerates  to 
a  Gaussian  PDF  (r(un)  =  1  fa\  for  all  a%).  The  threshold  goes  to  00  and  all  the 
terms  are  equally  weighted.  Once  the  threshold  is  calculated,  the  most  appropriate  (3 
for  a  given  e  and  p  can  be  found  by  a  least  squares  curve  fitting  method.  Specifically, 
a  suitable  range  of  u  (where  T(tt)  is  significantly  positive)  is  divided  in  1000  equally 
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spaced  points.  The  sum  of  (T(u)  -  f  (u))2  evaluated  at  these  points  is  then  minimized 
with  respect  to  (3.  Figure  6  plots  (3  vs.  e  for  different  values  of  p.  It  shows  that  a  sharp 
cutoff  (».e.,  a  high  value  of  (3)  is  necessary  only  when  the  contaminating  process  has 
high  power  and  it  appears  very  rarely  (large  p  and  small  e). 

A  typical  approximation  of  T(u)  by  f  (u)  has  been  plotted  in  Figure  7.  Both  the 
functions  are  plotted  vs.  u  in  the  same  scale.  o'g  —  1,  p  =  100  and  e  —  0.1  were 
assumed.  The  corresponding  threshold  uc  was  3.0224  and  the  most  suitable  / 3  was 
9.422.  Ki  =  0.979  and  K2  =  0.01  were  obtained  from  (22).  The  approximation  appers 
to  be  reasonably  accurate.  In  general,  accuracy  of  the  approximation  will  depend  on 
the  values  of  p  and  e. 

V.  The  case  of  unknown  PDF  parameters 

It  was  assumed  for  the  weighted  LS  estimator  described  in  section  m  that  0,  the 
vector  of  noise  PDF  parameters  was  known.  This  was  necessary  to  in  order  to  determine 
the  weighting  function  to  be  used  in  (21).  If  it  is  partially  or  completely  unknown,  it 
has  to  be  estimated.  This  will  undoubtedly  degrade  the  performance  of  the  estimator. 
It  will  be  shown  in  the  next  section  that  when  the  PDF  parameters  are  known,  the 
estimator  performance  of  the  weighted  LS  estimator  proposed  nearly  attains  the  CR 
bound.  Alternately,  the  performance  is  as  good  as  the  MLE.  Hence  for  the  purpose  of 
discussion  the  weighted  LS  estimator  can  be  considered  to  be  asymptotically  efficient 
when  the  PDF  parameters  are  known  so  that  T  is  known.  It  is  thus  of  interest  to 
determine  the  sensitivity  of  this  performance  to  changes  in  T  due  to  estimation  errors 
in  the  unknown  PDF  parameters.  One  way  of  quantifying  this  is  to  determine  the 
efficiency  of  the  corresponding  AR  filter  parameter  estimator  as  the  PDF  parameters 
vary  from  the  true  values  assumed  for  I\  The  problem  is  now  examined  from  the 
viewpoint  of  robust  M-estimators  [Martin  1979],  [Martin  and  Yohai  1984]. 

The  original  set  of  non-linear  equations  (6)  to  be  solved  for  the  MLE  (which  are 
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approximated  by  a  set  of  linear  equations  in  the  weighted  LS  method)  can  be  written 
as 


(24) 


where  <j>{un )  =  unr(un)  =  — /'(«„;  0)//(un;0  is  an  odd  function.  If  T(un)  is  as 
defined  in  (5)  with  the  true  PDF  parameters  then  the  solution  of  (24),  if  it  exists  and 
is  the  unique  maximum  of  the  likelihood  function,  is  the  MLE  of  a  (assuming  known 
PDF  parameters).  If  T(un)  is  replaced  by  a  different  limiter  curve  f  (un),  the  resulting 
estimator  is  termed  an  M-estimator  [Huber  1981],  When  the  true  PDF  is  not  perfectly 
known,  <f>  (or  T)  has  to  be  selected  on  the  basis  of  other  considerations,  c.  <7. ,  making 
the  estimator  performance  less  sensitive  to  the  PDF.  The  performance  of  an  estimator 
so  designed  is  not  as  good  as  the  MLE  (which  is  based  on  the  perfect  knowledge  of 
the  PDF),  but  assuming  f  is  well  chosen  its  performance  does  not  deteriorate  much  if 
the  actual  PDF  is  somewhat  different  from  the  PDF  which  produces  best  performance 
for  a  particular  selection  of  <f>.  Such  an  estimator  exhibits  efficiency  robustness  if  the 
performance  is  evaluated  in  terms  of  asymptotic  efficiency.  The  asymptotic  efficiency 
of  an  estimator  of  a  can  be  quantified  by  [Anderson  1971] 


matrix  of  a.  It  can  be  shown  that  0  <  EFF(§l,  /)  <  1  and  the  upper  bound  is  reached 
if  and  only  if  7(a)  =  7“^ (a).  In  the  case  of  M-estimates,  assuming  the  PDF  /  is 
symmetric,  (25)  reduces  to  [Martin  1979] 


(26) 


where 


(27) 
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The  asymptotic  efficiency  given  by  (26)  depends  on  how  well  the  function  (p  matches 
the  optimal  one  for  a  given  PDF  /.  It  attains  the  upper  bound  of  unity  if  only  if 
<t>  =  —f  If  or  alternately  (24)  represents  the  MLE  equations. 

For  a  Gaussian  M-estimator  (which  assumes  the  underlying  PDF  to  be  Gaussian 
with  zero  mean  and  variance  o2  for  the  purpose  of  choosing  <j>)  <f>  =  —f'/f  =  un/a2. 
Therefore  the  estimator  reduces  to  a  LS  estimator.  In  this  case,  assuming  the  true  PDF 
to  be  /,  (26)  implies 

EFF{&,f)  =  -L-  (28) 

It  is  known  [Sengupta  and  Kay  1986]  that  for  all  symmetric  PDF’s  a2//  >  1  with  the 
equality  holding  only  for  the  Gaussian  PDF.  Therefore  for  all  symmetric  non-Gaussian 
PDF  s  efficiency  of  the  LS  estimator  is  less  than  unity.  In  fact  the  LS  estimator  is 
known  to  be  severely  lacking  in  efficiency  robustness,  as  verified  by  Figures  8,  9,  10 
and  11  which  plot  the  asymptotic  efficiency  of  the  LS  estimator  given  by  (28)  for  the 
three  non-Gaussian  PDF’s  described  in  Section  II.  A  small  deviation  from  Gaussianity 
is  observed  to  produce  a  large  drop  in  asymptotic  efficiency.  For  example,  in  the 
case  of  a  mixed-Gaussian  PDF,  it  can  be  observed  from  Figure  8  that  p  =  100  and 
c  =  0-1  results  in  a  drop  by  a  factor  of  10  in  the  asymptotic  efficiency  from  the  value 
at  e  =  0  (which  corresponds  to  a  Gaussian  PDF).  Figure  9,  which  plots  the  asymptotic 
efficiency  of  the  LS  estimator  for  a  mixed-Gaussian  process  vs.  p  for  different  values 
of  6,  shows  that  the  estimator  loses  efficiency  for  moderately  large  values  of  p,  even 
when  e  is  reasonably  small.  In  the  cases  of  Middleton’s  class  A  and  Johnson’s  families 
Gaussianity  corresponds  to  large  values  of  B  and  t,  respectively.  In  both  cases  the 
asymptotic  efficiency  of  the  LS  estimator  drops  substantially  when  these  parameters 
are  smaller  (see  Figures  10  and  11). 

It  is  expected  that  a  wiser  choice  of  <p  (or  T)  in  (24)  would  result  in  a  better 
M-estimator.  If  the  PDF  parameters  (0)  were  known,  the  choice  T  =  -f'/uf  or  a 
suitable  approximation  T  thereof  would  have  been  optimal.  Since  these  parameters 
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axe  unknown,  a  selection  of  f  which  is  quite  appropriate  for  one  value  of  0  may  not 
be  suitable  for  other  values  of  it.  If  such  a  mismatch  does  not  reduce  the  asymptotic 
efficiency  substantially,  the  corresponding  M-estimator  would  be  considered  insensitive 
to  small  variations  in  the  PDF  parameters.  This  will  be  examined  by  plotting  the 
asymptotic  efficiency  of  a  nominal  M-estimator  as  the  true  PDF  parameter  values  vary 
from  the  values  for  which  the  chosen  estimator  is  optimal.  The  weighted  LS  estimator 
proposed  in  section  HI  may  be  viewed  as  an  M-estimator  where  un  (the  argument  of 
t)  is  replaced  by  a  preliminary  estimate  un.  Therefore  the  asymptotic  performance 
(in  terms  of  efficiency)  of  the  M-estimators,  which  will  now  be  described,  should  be  a 
good  indication  of  the  sensitivity  of  the  weighted  LS  estimator  to  changes  in  f  due  to 
estimation  errors  in  unknown  PDF  parameters. 

Figure  12  plots  the  asymptotic  efficiency  of  a  typical  M-estimator  for  different 
mixed-Gaussian  PDF’s.  A  fixed  limiter  curve  with  uc  =  3 ,  /?  =  10,  Ki  =  0.98  and 
Ki  —  0.01  is  used.  In  this  case 


<^(un)  —  Ujtf  (ttn) 


0.98un 


1  + 


Ur, 


10  d"  0.01un 


(29) 


c2b  is  assumed  to  be  unity  and  the  asymptotic  efficiency  (calculated  from  (26)  and  (29) 
by  numerical  integration)  is  plotted  vs.  e  for  different  values  of  p.  The  curve  corre¬ 


sponding  to  p  =  100  exhibits  a  maximum  (efficiency  «  1)  at  e  =  0.1  demonstrating  that 
a  mixed-Gaussian  PDF  with  c  =  0.1  and  p  —  100  is  most  suitable  for  this  M-estimator. 


In  fact,  these  values  of  e  and  p  were  actually  used  to  determine  f  as  described  in  Section 
IV  and  resulted  in  (29).  For  values  of  e  from  0  to  0.5  its  asymptotic  performance  is 
reasonably  good.  The  asymptotic  efficiency  is  0.98  at  e  =  0  (».«.,  the  PDF  is  Gaussian) 
and  0.90  at  e  =  0.5.  Therefore  for  a  truly  Gaussian  PDF  this  estimator  will  be  98%  as 
efficient  as  a  LS  estimator  (which  has  efficiency  of  one  for  a  Gaussian  PDF).  This  can 
be  thought  of  as  a  2%  premium  [Martin  1979]  for  a  90%  protection  or  coverage  against 
up  to  50%  outliers.  Figure  13  plots  the  asymptotic  efficiency  of  the  same  estimator  vs. 
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P  for  different  values  of  e.  It  shows  that  although  this  particular  choice  of  <p  is  most 
suitable  for  e  =  0.1  and  p  =  100,  the  asymptotic  efficiency  is  more  than  90%  up  to 

p  =  10000  for  values  of  e  less  than  0.1.  Improvement  over  the  curves  of  Figure  9  is 
quite  apparent. 

e  and  p  were  assumed  to  be  known  in  the  derivation  of  the  weighted  LS  estimator. 
Figures  12  and  13  show  that  the  asymptotic  efficiency  of  the  corresponding  M-estimator 
is  rather  insensitive  to  these  parameters.  Assuming  that  the  result  extends  to  the 
weighted  LS  estimator,  it  implies  that  when  these  parameters  are  not  known,  they  can 
be  approximated  by  crude  estimates  for  the  purpose  of  selecting  f  in  step  El  of  the 
suggested  estimation  procedure.  To  be  more  precise,  the  parameters  of  f  (namely,  uc, 
/?,  Ki  and  K^)  can  be  stored  in  a  table  as  functions  of  the  unknown  PDF  parameters 
and  the  proper  values  can  be  chosen  by  interpolation  from  these  tables.  Hardware 
memories  can  be  used  for  this  purpose  for  on-line  estimation.  The  resulting  estimator 
will  be  adaptive  in  nature  because  it  would  select  a  limiter  curve  depending  on  a  crude 
estimate  of  the  unknown  PDF  parameters.  This  result  adds  flexibility  to  the  method 
and  also  creates  a  possibility  of  estimating  the  unknown  PDF  parameters  more  precisely 
once  the  AR  filter  parameters  are  estimated  accurately. 

The  mixed-Gaussian  PDF  is  not  the  only  PDF  which  provides  such  an  opportunity. 
Figures  14  and  15,  which  plot  the  asymptotic  efficiency  of  the  M-estimator  (calculated 
with  a  typical  selection  of  f  in  each  case)  for  Middleton’s  class  A  and  Johnson  families, 
suggest  the  existence  of  similar  results  in  the  cases  of  other  non-Gaussian  PDF’s.  The 
parameters  of  f  chosen  for  Figure  13  were  uc  =  1.8,  (3  =  7,  Kx  =  0.955  and  FC2  =  0.045, 
which  are  quite  suitable  for  Middleton’s  class  A  PDF  with  A  =  0.05  and  B  =  1.0.  For 
smaller  values  of  B  the  M-estimator  shows  marked  improvement  over  the  LS  estimator 
(compare  Figure  10).  The  parameters  of  f  for  the  Johnson’s  family  were  chosen  to 
be  =  1.4,  (3  =  2,  K\  =  0.98  and  K%  —  0.02.  These  are  suitable  for  Johnson’s 
PDF  with  t  =  1.5.  A  comparison  of  Figures  11  and  15  reveals  that  the  M-estimator 
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does  not  lose  efficiency  as  fast  as  the  LS  estimator  as  t  becomes  smaller  (».e.  the 
PDF  becomes  more  non-Gaussian).  These  are  instances  of  the  M-estimator  being 
insensitive  to  small  variations  in  some  of  the  PDF  parameters.  It  is  not  clear  how 
well  these  results  apply  to  the  weighted  LS  estimator,  and  hence  its  improvement  over 
the  LS  estimator  may  be  somewhat  less  than  what  the  curves  show.  The  central 
argument  is  that  the  proposed  estimator  improves  the  efficiency  robustness  of  the  LS 
estimator  by  weighting  the  squared  terms,  and  it  also  reduces  computation  over  the 
M-estimator  by  approximating  the  argument  of  f .  Its  ability  to  handle  the  case  of 
unknown  PDF  parameters  makes  it  more  attractive  in  practice  than  an  M-estimator 
which  requires  the  solution  of  non-linear  equations.  The  following  section  presents  the 
results  of  computer  simulations  which  justify  the  approximations  made  in  deriving  the 
weighted  LS  estimator. 


VI.  Simulation  of  the  performance  of  the  weighted  LS  estimator 

Two  typical  AR(4)  processes  [Kay  1986]  was  chosen  for  computer  simulations.  The 
parameters  are  given  in  Table  A.  Process  I  is  broadband  while  process  II  is  narrowband. 
The  underlying  PDF  is  assumed  to  be  mixed-Gaussian  with  o%  =  1  and  p  =  100.  The 
mixture  parameter  is  e  =  0.1.  The  AR  process  was  generated  by  passing  a  white  mixed- 
Gaussian  process  through  a  filter,  allowing  sufficient  time  for  the  transients  to  decay. 
The  white  process  was  generated  by  randomly  selecting  from  two  mutually  independent 
white  Gaussian  processes  with  PDF’s  EB  and  Er  (having  variances  a\  and  o]  =  po\ 
respectively)  on  the  basis  of  a  series  of  Bernoulli  trials  with  probability  of  success  e. 
Thus  a  random  variable  could  be  expected  to  come  from  the  background  population  for 
(1-e)  fraction  of  times  and  from  the  contaminating  population  for  e  fraction  of  times. 
In  accordance  with  the  discussion  in  the  previous  section  one  of  the  PDF  parameters, 
namely  e,  was  assumed  to  be  unknown,  e  is  linearly  related  to  the  overall  variance  o2 
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°2  —  as[(  1  ~  e)  +  eP)] 


(30a) 


of  the  PDF, 

».e., 


(306) 

It  was  suggested  in  the  previous  section  that  a  crude  estimator  of  e  can  be  used  to 
select  the  proper  weighting  curve.  In  this  case  the  driving  noise  power,  ».e.,  a2  was 
estimated  along  with  a  in  the  first  step  of  unweighted  LS  estimation  using  covariance 
method  (see  (8))  and  e  was  calculated  from  this  estimate  using  (30b). 

Table  B  shows  the  sample  means  and  sample  variances  of  the  AR  filter  parameter 
estimators  obtained  by  the  exact  evaluation  of  the  approximate  MLE.  The  MLE  is 
found  by  the  four-dimensional  optimization  (for  the  four  AR  filter  parameters)  of  the 
conditional  likelihood  function  as  reported  in  [Sengupta  and  Kay  1986].  A  Newton- 
Raphson  iterative  procedure  was  used  for  this  purpose,  with  initial  conditions  obtained 
from  a  preliminary  stage  of  least  squares  estimation,  namely,  the  Forward-backward 
method  [Kay  1986].  The  value  of  e  used  in  (6)  was  as  obtained  from  o2  in  the  first  stage 
of  LS  estimation.  1000  data  points  were  used  and  the  result  is  based  on  500  experiments. 
The  results,  as  summarized  in  Table  D,  can  be  compared  to  the  performance  of  the 
Forward-Backward  estimator  (see  Table  C)  and  the  CR  bound.  The  MLE  achieves 
the  CR  bound  while  the  variances  of  the  Forward-Backward  estimators  are  larger  by 
a  factor  of  10.  This  agrees  with  the  theoretical  prediction  of  previous  section,  as  the 
asymptotic  efficiency  of  an  LS  estimator,  given  in  this  case  by  (28),  is  0.106  «  1/10. 
Table  D  reports  the  performance  of  the  weighted  LS  estimator.  The  loss  of  performance 
is  only  marginal,  as  is  verified  by  comparing  it  to  the  performance  of  the  exact  MLE. 

Evaluation  of  the  exact  MLE  is  not  only  computationally  intensive,  but  it  also 
suffers  from  convergence  problems.  For  1000  data  points  1%  of  the  experiments  failed 


1 

e  = - 

P-1 
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to  converge.  For  short  data  records  ( t.g .,  N  <  250)  it  almost  never  converges.  However 
the  weighted  LS  does  not  suffer  from  this  problem.  The  performance  of  the  unweighted 
and  weighted  LS  estimators  for  N  =  100  is  summarized  in  Tables  E  and  F,  respectively, 
along  with  the  CR  bound.  The  unweighted  LS  (Forward-Backward)  estimator  continues 
to  be  off  from  the  CR  bound  by  a  factor  of  10  for  Process  I.  The  offset  increases  to 
a  factor  of  15  for  Process  Et.  The  weighted  LS  suffers  from  a  slight  degradation  of 
performance:  it  is  off  from  the  CR  bound  by  a  factor  of  2  for  Process  I  and  by  a 
factor  of  2.5  for  Process  II.  This  is  probably  due  to  the  increased  inaccuracy  of  the 
simplifying  approximations  for  shorter  data  records.  It  still  exhibits  improvement  over 
the  Forward-Backward  estimator.  It  is  also  seen  to  have  less  bias  as  compared  to  the 
Forward-Backward  estimator  in  all  cases. 


VII.  Conclusions 

The  weighted  LS  estimator  proposed  in  this  paper  yields  accurate  estimates  of 
the  parameters  of  an  AR  process  excited  by  non- Gaussian  white  noise.  The  method 
utilizes  the  partial  information  available  about  the  noise  PDF  (principally  the  form  of 
the  PDF  to  within  a  set  of  unknown  parameters)  and  should  thereby  outperform  the 
so-called  robust  estimators.  It  also  reduces  computation  by  avoiding  solution  of  non¬ 
linear  equations  required  by  the  MLE  or  a  robust  estimator.  Computer  simulations 
have  justified  the  assumptions  made  in  determining  the  estimator.  The  new  technique 
does  not  suffer  from  convergence  problems,  and  exhibits  only  a  slight  departure  in 
performance  from  the  CR  bound  for  short  data  records.  The  weighted  LS  estimator  for 
the  AR  filter  parameters  can  be  used  in  conjunction  with  other  estimation  techniques 
directed  towards  assessment  of  unknown  PDF  parameters.  In  some  situations  it  may 
tolerate  a  reasonable  inaccuracy  in  the  estimation  of  these  parameters  and  yet  produce 
an  accurate  estimate  of  the  AR  filter  parameters. 
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Table  A:  Parameters  of  the  AR  processes  used  for  simulation 


Process 

<*i 

a4 

poles 

i 

-1.352 

1.338 

-0.662 

0.240 

0.7exp[y27r(0.12)] 

0.7  exp[j'2x(0.21)j 

ii 

-2.760 

3.809 

-2.654 

0.924 

0.98exp[y27r(0.11)] 

0.98exp[j27r(0.14)] 

Table  B:  Performance  of  the  MLE,  N  =  1000 


True 

value 

Sample 

mean 

Bias2 

Sample 

variance 

Cramer- Rao 

bound 

*1 

-1.352 

-1.3527 

4.900  x  10-7 

1.0221  X  10“4 

1.0491  x  10"4 

Process 

02 

1.338 

1.3391 

1.210  x  10~6 

2.4601  x  10-4 

2.5961  x  10-4 

I 

03 

-0.662 

-0.6630 

1.000  X  HT6 

2.4251  X  10~4 

2.5961  x  10"4 

04 

0.240 

0.2404 

1.600  x  10"7 

1.1035  x  10"4 

1.0491  x  10“4 

Ol 

-2.760 

-2.7597 

9.000  x  10"8 

1.7445  X  10-6 

1.6278  x  10~5 

Process 

02 

3.809 

3.8081 

8.100  x  10“7 

9.0097  x  10-6 

8.0163  X  10-6 

II 

03 

-2.654 

-2.6530 

1.000  x  10"6 

9.1303  X  10~6 

8.0163  x  10“6 

04 

0.924 

0.9236 

1.600  x  10"  7 

1.8496  x  10"5 

1.6278  X  10“5 

Figure  1(a)  The  weighting  curve  for  Mixed-Gaussian  PDF 
for  different  e's 


Table  C:  Performance  of  the  Forward-Backward  Estimator,  N  =  1000 


True 

value 

Sample 

mean 

Bias2 

Sample 

variance 

Cramer-Rao 

bound 

a  i 

-1.352 

-1.3482 

1.444  x  10~6 

1.0197  x  io~3 

1.0491  x  10-4 

Process 

a2 

1.338 

1.3326 

2.916  x  10“6 

2.3822  X  10“3 

2.5961  x  10"4 

I 

<*3 

—0.662 

-0.6591 

8.410  x  10"6 

2.3531  X  10~3 

2.5961  x  10"4 

<*4 

0.240 

0.2382 

3.240  x  10-4 

9.6246  x  10“4 

1.0491  X  10~4 

al 

-2.760 

-2.7567 

1.089  x  10~5 

1.6569  X  10"4 

1.6278  X  10-6 

Process 

<*2 

3.809 

3.8001 

7.921  x  10~6 

8.3418  X  10“4 

8.0163  x  10~5 

II 

03 

-2.654 

-2.6447 

8.649  X  10"6 

8.4388  X  10-4 

8.0163  x  10-6 

04 

0.924 

0.9197 

1.849  x  10"6 

1.7083  x  10-4 

1.6278  x  10-5 

Table  D:  Performance  of  the  Weighted  LS  Estimator,  N  =  1000 


True 

value 

Sample 

mean 

Bias2 

Sample 

variance 

Cramer-Rao 

bound 

Ol 

-1.352 

-1.3525 

2.500  x  10~7 

1.1169  x  lO'4 

1.0491  x  10-4 

Process 

02 

1.338 

1.3388 

6.400  x  10~7 

2.6466  x  lO-4 

2.5961  x  10“4 

I 

03 

-0.662 

-0.6628 

6.400  x  10"7 

2.6389  x  10-4 

2.5961  x  10"4 

O4 

0.240 

0.2402 

4.000  x  10"8 

1.1049  x  10-4 

1.0491  x  10-4 

01 

-2.760 

-2.7595 

2.500  x  10- 7 

1.9003  x  10-6 

1.6278  x  10-6 

Process 

a3 

3.809 

3.8075 

2.250  x  10“6 

9.8729  x  10"6 

8.0163  x  lO-6 

II 

03 

-2.654 

-2.6524 

2.560  x  KT6 

1.0043  x  10“4 

8.0163  x  10"6 

04 

0.924 

0.9233 

4.900  X  10- 7 

2.0422  X  10-6 

1.6278  x  10“5 

Table  E:  Performance  of  the  Forward-Backward  Estimator,  N  =  100 


True 

value 

Sample 

mean 

Bias2 

Sample 

variance 

Cramer-Rao 

bound 

<*i 

-1.352 

-1.3328 

3.686  x  10-4 

9.7580  X  10"3 

1.0491  x  10~3 

Process 

a2 

1.338 

1.3095 

8.123  X  10~4 

2.1536  x  10"2 

2.5961  X  10~3 

I 

03 

-0.662 

-0.6383 

5.617  x  10~4 

2.0595  X  10"2 

2.5961  x  10"3 

o  4 

0.240 

0.2324 

5.776  x  10"5 

8.5016  X  10"3 

1.0491  x  10"3 

oi 

-2.760 

-2.7372 

5.198  X  KT4 

2.4807  X  10"3 

1.6278  X  10-4 

Process 

a7 

3.809 

3.7494 

3.552  X  10~3 

1.2929  x  10~2 

8.0163  x  10-4 

II 

03 

-2.654 

-2.5934 

3.672  X  10-3 

1.2943  x  10~2 

8.0163  x  10~4 

O4 

0.924 

0.8974 

7.076  x  10-4 

2.5026  x  10~3 

1.6278  X  10“4 

Table  F:  Performance  of  the  Weighted  LS  Estimator,  N  =  100 


True 

value 

Sample 

mean 

Bias3 

Sample 

variance 

Cramer-Rao 

bound 

<*1 

-r.352 

-1.3476 

1.936  X  10~6 

1.9844  x  10"3 

1.0491  X  10“3 

Process 

1.338 

1.3293 

7.569  x  10~6 

5.2554  x  10~3 

2.5961  x  10"3 

I 

03 

-0.662 

-0.6536 

7.056  X  10“5 

5.2139  x  10"3 

2.5961  x  10-3 

O4 

0.240 

0.2366 

1.156  x  lO"6 

2.0403  x  10~3 

1.0491  x  10“3 

Ox 

-2.760 

-2.7543 

3.249  X  10“5 

3.6109  x  10~4 

1.6278  x  10-4 

Process 

02 

3.809 

3.7936 

2.372  X  10-4 

1.9891  x  10"3 

8.0163  X  10"4 

II 

03 

-2.654 

-2.6380 

2.560  X  10~4 

2.0666  X  10"3 

8.0163  x  10~4 

04 

0.924 

0.9168 

5.184  x  10"6 

4.3322  x  10~4 

1.6278  x  10-4 
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Figure  1(b)  The  weighting  curve  for  Mixed-Gaussian  PDF 
for  different  's 
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Figure  2(a)  The  weighting  curve  for  Middleton's  class  A  PDF 
for  different  A's 
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Figure  2(b)  The  weighting  curve  for  Middleton's  class  A  PDF 
for  different  B's 
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Figure  3  Tile  weighting  curve  for  Johnson's  PDF 
for  different  t's 
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Figure  4 


Dependence  of  threshold  on  e 
in  the  Mixed-Gaussian  case 
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Dependence  of  threshold  on  p 
in  the  Mixed-Gaussian  case 
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Figure  6  Dependence  of  $  on  e  and  p 
in  the  Mixed-Gaussian  case 
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Figure  7  A  typical  approximation  for  the  weighting  curve 
in  the  Mixed-Gaussian  case 
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Figure  9  Asymptotic  efficiency  of  LS  estimator  vs.P 
for  the  Mixed-Gaussian  process 


°  '  3 

1 - 

2 

i 

1 

- 1 - 1 - 

-1 

-2 

10 

10 

10 

B 

1  10 

10 

Figure  10 

Asymptotic 

efficiency 

of  LS  estimator  vs.  B 

for  Middleton's  class 

A  process 

Asymptotic  efficiency 


o 

o 


t 


Figure  11  Asymptotic  efficiency  of  LS  estimator  vs.  t 
for  Johnson's  process 
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Figure  12  Asymptotic  efficiency  of  M-estimator  vs.  £ 
for  the  Mixed -Gaussian  process 
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Figure  13  Asymptotic  efficiency  of  M-estimator  vs.  p 
for  the  Mixed-Gaussian  process 
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Figure  15  Asymptotic  efficiency  of  M-estimator  vs.  t 
for  Johnson's  process 
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Figure  14  Asymptotic  efficiency  of  M-estimator  vs.  B 
for  Middleton's  class  A  process 
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of  Non-Gaussian  Autoregressive  Processes 
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Abstract 


A  new  technique  for  the  estimation  of  autoregressive  filter  parameters  of  a  non- 
Gaussian  autoregressive  process  is  proposed.  The  probability  density  function  of 
the  driving  noise  is  assumed  to  be  known.  The  new  technique  is  a  two-stage  pro¬ 
cedure  motivated  by  maximum  likelihood  estimation.  It  is  computationally  much 
simpler  than  the  maximum  likelihood  estimator  and  does  not  suffer  from  conver¬ 
gence  problems.  Computer  simulations  indicate  that  unlike  the  least  squares  or 
linear  prediction  estimators,  the  proposed  estimator  is  nearly  efficient,  even  for 
moderately  sized  data  records.  By  a  slight  modification  the  proposed  estimator 
can  also  be  used  in  the  case  when  the  parameters  of  the  driving  noise  probability 
density  function  are  not  known. 
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Figure  14  Asymptotic  efficiency  of  M-estimator  vs.  B 
for  Middleton's  class  A  process 
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Figure  13  Asymptotic  efficiency  of  M-estimator  vs.  p 
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Figure  10  Asymptotic  efficiency  of  LS  estimator  vs.  B 
for  Middleton’s  class  A  process 
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Figure  9  Asymptotic  efficiency  of  LS  estimator  vs.P 
for  the  Mixed-Gaussian  process 
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Figure  6  Dependence  of  $  on  e  and  p 
in  the  Mixed-Gaussian  case 
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Figure  7  A  typical  approximation  for  the  weighting  curve 
in  the  Mixed- Gaussian  case 
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Figure  4  Dependence  of  threshold  on  e 
in  the  Mixed-Gaussian  case 
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